eBPF: Orchestrating the Kernel’s Secret Symphony

15 October 2023 — Hacking

Introduction

In 2014, a revolutionary tool emerged in the Linux ecosystem, paving the way for a new level of kernel control and monitoring: eBPF (extended Berkeley Packet Filter). This technology lets you interact directly with the Linux kernel through eBPF programs, which are loaded as bytecode. Because that bytecode is portable across architectures, the kernel can run it on the BPF virtual machine wherever it is loaded, either through an interpreter or, more commonly, after just-in-time (JIT) compilation to native instructions.

The Mysterious BPF Virtual Machine

To understand how these eBPF programs run at the core of the kernel, we first need to look at the BPF virtual machine (BPF VM). This VM is 64-bit, follows RISC (Reduced Instruction Set Computing) principles, and has its own register set: ten general-purpose registers (r0 to r9) plus a read-only frame pointer (r10).
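
To make the register-machine idea concrete, here is a deliberately tiny Python model of such a VM. It is only a sketch: the opcode names (mov, add, mul, exit), the tuple layout, and the run function are invented for illustration and do not match the real eBPF instruction encoding. The one thing it borrows from eBPF is the register file: eleven 64-bit registers, with r0 holding the return value.

```python
MASK64 = (1 << 64) - 1  # registers wrap around at 64 bits

def run(program):
    """Interpret a list of (op, dst, operand) tuples and return r0."""
    regs = [0] * 11  # r0..r10, modelled after eBPF's register file
    pc = 0
    while pc < len(program):
        op, dst, operand = program[pc]
        # An operand is either an immediate int or a register name like "r1".
        value = regs[int(operand[1:])] if isinstance(operand, str) else operand
        if op == "mov":
            regs[dst] = value & MASK64
        elif op == "add":
            regs[dst] = (regs[dst] + value) & MASK64
        elif op == "mul":
            regs[dst] = (regs[dst] * value) & MASK64
        elif op == "exit":
            return regs[0]
        else:
            raise ValueError(f"unknown opcode: {op}")
        pc += 1
    raise RuntimeError("program ended without an exit instruction")

# r1 = 6; r0 = 7; r0 *= r1; exit
demo = [("mov", 1, 6), ("mov", 0, 7), ("mul", 0, "r1"), ("exit", 0, 0)]
result = run(demo)
print(result)
```

The real VM adds memory loads and stores, jumps, and calls to kernel helper functions, but the basic shape is the same: a small fixed set of registers and simple instructions that operate on them.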

RISC designs are characterized by a small set of simple instructions, in contrast to CISC (Complex Instruction Set Computing) architectures, which offer a broader and more complex instruction set. The line is not always sharp: modern x86 processors, for example, expose a CISC instruction set but decode it internally into simpler, RISC-like micro-operations. On the RISC side, Apple’s A6 and NVIDIA’s Tegra 3 both implement the ARM instruction set (the Tegra 3 with Cortex-A9 cores), a typical RISC design.

Thanks to this BPF VM, BPF programs are loaded and executed directly in the kernel, with access restricted to certain features and a bounded portion of memory. Note that this VM has nothing to do with KVM (Kernel-based Virtual Machine), which virtualizes entire machines and relies on additional kernel modules to do so. The BPF VM, by contrast, is only an instruction set and its execution environment, fully integrated into kernel space.

The Big Question: Why eBPF?

You might wonder why you would deploy eBPF programs if the same tasks can be carried out with kernel modules. The answer largely lies in security.

Developing kernel modules is notoriously risky. Numerous vulnerabilities stemming from imperfectly designed modules have tarnished the Linux kernel’s security reputation. For example, a buffer overflow in a module can let an attacker escalate privileges to root by manipulating task credentials, even when defenses such as ASLR or execution restrictions are in place.

In the Linux kernel, every program or process is represented by an entity known as a task. Certain internal functions can raise a task’s privileges (UID 0 / GID 0) through calls such as prepare_kernel_cred(0) and commit_creds.

By contrast, an eBPF program is designed for greater safety:

It is finite (not Turing-complete): the verifier rejects any program it cannot prove will terminate, so unbounded loops are impossible and every run has a guaranteed endpoint.
It works through hooks (attachment points) placed throughout the kernel, and runs only when the corresponding event fires.
It is loaded as bytecode, which the kernel’s BPF verifier checks before the program is allowed to run.
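
The hook model can be pictured with a few lines of Python. This is a toy, not how the kernel implements attachment: the attach and fire functions and the hooks table are hypothetical names, and the event names are borrowed from syscall tracepoints purely as labels. The point is only that a program sits idle until the event it is attached to occurs.

```python
from collections import defaultdict

hooks = defaultdict(list)  # event name -> attached handler functions

def attach(event, handler):
    """Register a handler to run whenever `event` fires."""
    hooks[event].append(handler)

def fire(event, context):
    """Simulate the kernel reaching a hook point: run every attached handler."""
    return [handler(context) for handler in hooks[event]]

opened = []  # filenames seen by our toy "tracing program"
attach("sys_enter_openat", lambda ctx: opened.append(ctx["filename"]))

fire("sys_enter_openat", {"pid": 1234, "filename": "/etc/passwd"})
fire("sys_enter_write", {"pid": 1234, "fd": 1})  # nothing attached, nothing runs
print(opened)
```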

BPF Programs: Discovering the Arsenal

BPF programs are generally split into two main categories: tracing and networking.

Tracing

These programs enable you to observe system and kernel activity, providing access to process memory, resources (file descriptors, memory regions, CPUs, etc.), and the ability to log internal operations. This makes it possible to monitor how applications or the kernel behave, to optimize performance or detect anomalies.

Networking

As its name suggests, this category deals with network traffic management. A BPF program can be attached to various points of the network stack: just before sending or receiving a packet, or before passing data to userspace. There are many possibilities here for controlling, filtering, or even modifying traffic in real time.

Examples of BPF Program Types

Socket Filter
Allows inspection of every packet passing through a socket. The filter can decide how much of a packet the socket receives, but it does not alter the packet’s content or its routing.
- Declared in code by the constant: BPF_PROG_TYPE_SOCKET_FILTER
Kprobe
Attaches to “kprobes,” that is, dynamic hook points (comparable to breakpoints) that can be placed within kernel code, enabling observation of internal kernel functions.
- Referred to as: BPF_PROG_TYPE_KPROBE
XDP (Express Data Path)
Operates right at the start of packet processing in the network driver, before it even reaches the kernel network stack. An XDP program can decide to pass a packet (XDP_PASS), drop it (XDP_DROP), and so on. This grants extremely powerful control over traffic and is often used to build DDoS mitigation or load-balancing components.
- Identified by: BPF_PROG_TYPE_XDP
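
To give a feel for the kind of decision an XDP program makes, here is a userspace Python imitation. It is not an eBPF program and cannot be loaded into the kernel; real XDP programs are typically written in restricted C. The verdict numbers follow the kernel’s enum xdp_action, while xdp_filter, make_frame, and the blocklist are hypothetical names made up for this sketch.

```python
import socket
import struct

# Verdict values as defined by enum xdp_action in the kernel UAPI headers.
XDP_ABORTED, XDP_DROP, XDP_PASS, XDP_TX, XDP_REDIRECT = range(5)

ETH_HLEN = 14      # destination MAC + source MAC + EtherType
ETH_P_IP = 0x0800  # EtherType for IPv4

BLOCKED_SOURCES = {"203.0.113.7"}  # hypothetical blocklist (documentation range)

def xdp_filter(frame: bytes) -> int:
    """Return a verdict for one raw Ethernet frame."""
    # Bounds check first: never read past the end of the packet.
    if len(frame) < ETH_HLEN + 20:
        return XDP_PASS
    (ethertype,) = struct.unpack_from("!H", frame, 12)
    if ethertype != ETH_P_IP:
        return XDP_PASS
    # The IPv4 source address sits 12 bytes into the IP header.
    src = socket.inet_ntoa(frame[ETH_HLEN + 12:ETH_HLEN + 16])
    return XDP_DROP if src in BLOCKED_SOURCES else XDP_PASS

def make_frame(src_ip: str) -> bytes:
    """Build a minimal Ethernet + IPv4 frame for the demo (no payload)."""
    eth = bytes(6) + bytes(6) + struct.pack("!H", ETH_P_IP)
    ip = bytearray(20)
    ip[0] = 0x45  # version 4, header length of 5 32-bit words
    ip[12:16] = socket.inet_aton(src_ip)
    ip[16:20] = socket.inet_aton("192.0.2.1")
    return eth + bytes(ip)

print(xdp_filter(make_frame("203.0.113.7")))   # blocked source
print(xdp_filter(make_frame("198.51.100.5")))  # any other source
```

The explicit length check mirrors a habit that real XDP code is forced into: the verifier refuses programs that read packet data without first proving the access stays within bounds.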

Many other program types exist, for instance those that focus on cgroups, performance events, or tracepoints. A more comprehensive list is available on lwn.net.

The BPF Verifier: The Unfailing Sentry

Running code in kernel space might seem risky, especially in production. That’s where the BPF verifier comes in, acting as a watchdog that refuses to load any program it cannot show to be safe and consistent with the BPF model.

Graph Analysis: The verifier builds a control-flow graph of the program, where each instruction is a node linked to the nodes it can branch to, and uses it to detect loops and unreachable instructions.
Hunt for Anomalies: It searches for issues like unbounded loops, recursion, exceeding the maximum instruction count (4096 historically; since Linux 5.2, privileged programs may be much larger), unreachable code, access violations, etc.
Simulated Execution: Finally, the verifier walks the possible execution paths, simulating each instruction and tracking the state of registers and stack to confirm that every instruction is valid and that there are no pointer or addressing errors. The code is not actually run at this stage. If every path reaches the BPF_EXIT instruction without a violation, the verifier gives the program a green light.
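
The graph-related checks can be sketched in a few dozen lines of Python. This is a toy model under heavy assumptions: the instruction format (alu, jmp, jcond, exit tuples) and the verify function are invented here, and the real verifier does far more, notably tracking register and stack state on every path. Recent kernels also accept loops whose bound they can prove, which this sketch does not attempt.

```python
MAX_INSNS = 4096  # the classic limit quoted in the text

def successors(prog, i):
    """Indices that instruction i can transfer control to."""
    op = prog[i]
    if op[0] == "exit":
        return []
    if op[0] == "jmp":    # unconditional jump
        return [op[1]]
    if op[0] == "jcond":  # conditional: fall through or jump
        return [i + 1, op[1]]
    return [i + 1]        # plain instruction: fall through

def verify(prog):
    """Return "ok" or a string describing why the program is rejected."""
    if not prog or len(prog) > MAX_INSNS:
        return "bad program size"
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited / on current path / finished
    color = [WHITE] * len(prog)
    color[0] = GREY
    stack = [(0, iter(successors(prog, 0)))]  # iterative depth-first search
    while stack:
        node, it = stack[-1]
        nxt = next(it, None)
        if nxt is None:
            color[node] = BLACK
            stack.pop()
        elif not 0 <= nxt < len(prog):
            return f"control leaves program at insn {node}"
        elif color[nxt] == GREY:
            return f"back-edge from insn {node} to {nxt}"
        elif color[nxt] == WHITE:
            color[nxt] = GREY
            stack.append((nxt, iter(successors(prog, nxt))))
    if WHITE in color:
        return f"unreachable insn {color.index(WHITE)}"
    return "ok"

good = [("alu",), ("jcond", 3), ("alu",), ("exit",)]
loop = [("alu",), ("jmp", 0)]
dead = [("jmp", 2), ("alu",), ("exit",)]
for name, prog in (("good", good), ("loop", loop), ("dead", dead)):
    print(name, "->", verify(prog))
```

A back-edge (an edge to an instruction still on the current search path) is exactly what a loop looks like in the control-flow graph, which is why a depth-first search is enough to find one.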

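When loading a program with the bpf() syscall (the BPF_PROG_LOAD command), you can supply a log buffer and a log level through the log_buf, log_size, and log_level fields of the bpf_attr structure. The verifier then writes a trace of its analysis into that buffer, which helps troubleshoot why a program was rejected.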

In Conclusion

We’ve barely scratched the surface of the eBPF world, without diving into advanced concepts like BTF (BPF Type Format) or BPF tail calls. These features further extend the system’s capabilities: BTF provides richer type and debugging metadata, while tail calls let one program jump to another, which makes it possible to work around the 4096-instruction limit by chaining multiple programs.
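
As a rough picture of tail-call chaining, the Python sketch below models a program array and a dispatcher. Everything in it is hypothetical (run_chain, the two stage functions, the dictionary used as a program array), and the cap of 33 chained calls is an assumption modelled on the kernel’s tail-call limit. Each stage returns a verdict plus the slot it wants to jump to; as in eBPF, a tail call that cannot be performed simply lets the calling program finish with its own verdict.

```python
MAX_TAIL_CALLS = 33  # assumed cap, modelled on the kernel's tail-call limit

def run_chain(prog_array, start, ctx):
    """Run programs from prog_array, following tail calls until one finishes."""
    index, calls = start, 0
    while True:
        verdict, next_index = prog_array[index](ctx)
        if next_index is None:
            return verdict  # program finished normally
        if next_index not in prog_array or calls >= MAX_TAIL_CALLS:
            return verdict  # tail call not possible: caller's verdict stands
        index, calls = next_index, calls + 1

def parse(ctx):
    ctx["trace"].append("parse")
    return "pass", 1  # hand over to the program in slot 1

def classify(ctx):
    ctx["trace"].append("classify")
    return ("drop" if ctx["port"] == 23 else "pass"), None

prog_array = {0: parse, 1: classify}
ctx = {"port": 23, "trace": []}
verdict = run_chain(prog_array, 0, ctx)
print(verdict, ctx["trace"])
```

Splitting the work this way keeps each stage small enough to pass verification on its own, while the cap on chained calls preserves the guarantee that execution eventually ends.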

In short, remember that eBPF is a remarkable tool for monitoring, observability, and even traffic interception at the very heart of the kernel, with far less of the vulnerability risk associated with traditional kernel modules. Future articles will delve deeper into these topics to shed even more light on the mysteries of eBPF.

See you soon, and enjoy exploring this fascinating realm where the kernel is the ultimate playground!